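We study model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent, that is, where every action has a preset execution duration. During execution, changes in the environment are influenced by, but not synchronized with, the execution of actions. This setting is ubiquitous in real-world problems. However, most MARL methods assume that actions are executed immediately after inference, which is often unrealistic and can lead to catastrophic failures of multi-agent coordination. To fill this gap, we develop an algorithmic framework for MARL with off-beat actions. We then propose LeGEM, a novel episodic memory for model-free MARL algorithms. LeGEM builds each agent's episodic memory from that agent's individual experiences. It improves multi-agent learning by addressing the challenging temporal credit assignment problem raised by off-beat actions through our novel reward redistribution scheme, which alleviates the issue of non-Markovian rewards. We evaluate LeGEM on a variety of multi-agent scenarios, including the Stag-Hunter Game, the Quarry Game, the Afforestation Game, and StarCraft II micromanagement tasks. Empirical results show that LeGEM significantly improves multi-agent coordination, achieving leading performance and better sample efficiency.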
This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at github.com/magicleap/SuperGluePretrainedNetwork.
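The assignment step described above can be illustrated with a small entropy-regularized optimal transport solver. This is a minimal sketch, not the SuperGlue implementation: the function name `sinkhorn`, the uniform marginals, and the parameters `eps` and `n_iters` are assumptions made for illustration, the cost matrix is hand-written rather than predicted by a graph neural network, and the handling of non-matchable points is omitted.

```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    Illustrative only: `cost` is an (m, n) matrix of matching costs.
    Returns a soft assignment matrix whose rows sum to 1/m and whose
    columns sum to 1/n.
    """
    m, n = cost.shape
    kernel = np.exp(-cost / eps)      # Gibbs kernel
    a = np.full(m, 1.0 / m)           # row marginals (assumed uniform)
    b = np.full(n, 1.0 / n)           # column marginals (assumed uniform)
    u = np.ones(m)
    v = np.ones(n)
    for _ in range(n_iters):
        u = a / (kernel @ v)          # rescale rows to match a
        v = b / (kernel.T @ u)        # rescale columns to match b
    return u[:, None] * kernel * v[None, :]

# Toy cost matrix: feature i in the first set is cheapest to match
# with feature i in the second set.
cost = np.array([[0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0],
                 [1.0, 1.0, 0.0]])
plan = sinkhorn(cost)
matches = plan.argmax(axis=1)         # hard matches read off the soft plan
```

Every operation in the loop is differentiable, which is what lets such a solver sit inside a network trained end to end.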
This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multihomography approach for boosting interest point detection repeatability and performing cross-domain adaptation (e.g., synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system gives rise to state-of-the-art homography estimation results on HPatches when compared to LIFT, SIFT and ORB.
Deep multitask networks, in which one neural network produces multiple predictive outputs, can offer better speed and performance than their single-task counterparts but are challenging to train properly. We present a gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes. We show that for various network architectures, for both regression and classification tasks, and on both synthetic and real datasets, GradNorm improves accuracy and reduces overfitting across multiple tasks when compared to single-task networks, static baselines, and other adaptive multitask loss balancing techniques. GradNorm also matches or surpasses the performance of exhaustive grid search methods, despite only involving a single asymmetry hyperparameter α. Thus, what was once a tedious search process that incurred exponentially more compute for each task added can now be accomplished within a few training runs, irrespective of the number of tasks. Ultimately, we will demonstrate that gradient manipulation affords us great control over the training dynamics of multitask networks and may be one of the keys to unlocking the potential of multitask learning.
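To make the idea of tuning gradient magnitudes concrete, the following sketch performs one update of the task weights in the spirit of GradNorm. It is a simplified illustration under stated assumptions, not the authors' code: the function `gradnorm_step`, the learning rate `lr`, and the inputs (per-task gradient norms of the unweighted losses and per-task loss ratios relative to the initial loss) are hypothetical names chosen here, and the target gradient norm is treated as a constant when differentiating.

```python
import numpy as np

def gradnorm_step(weights, grad_norms, loss_ratios, alpha=1.5, lr=0.025):
    """One illustrative update of task loss weights.

    weights     : current loss weight per task
    grad_norms  : gradient norm of each unweighted task loss w.r.t.
                  shared parameters (assumed to be computed elsewhere)
    loss_ratios : current loss divided by initial loss, per task
    alpha       : asymmetry hyperparameter
    """
    weights = np.asarray(weights, dtype=float)
    grad_norms = np.asarray(grad_norms, dtype=float)
    loss_ratios = np.asarray(loss_ratios, dtype=float)

    weighted = weights * grad_norms               # weighted gradient norm per task
    rel_rate = loss_ratios / loss_ratios.mean()   # relative inverse training rate
    target = weighted.mean() * rel_rate ** alpha  # desired norm, held constant

    # Gradient of sum_i |weighted_i - target_i| with respect to weights.
    grad_w = np.sign(weighted - target) * grad_norms
    new_w = np.clip(weights - lr * grad_w, 1e-8, None)

    # Renormalize so the weights sum to the number of tasks.
    return new_w * len(new_w) / new_w.sum()

# Task 0 has a much larger gradient than task 1, so its weight shrinks.
new_weights = gradnorm_step([1.0, 1.0], [10.0, 1.0], [1.0, 1.0])
```

In this toy call both tasks train at the same rate, so the update only equalizes gradient magnitudes; with unequal loss ratios, a larger `alpha` pushes more gradient toward the task that is training more slowly.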